LearningPinocchio: adaptive information extraction for real world applications
Abstract
The new frontier of research on Information Extraction (IE) from texts is portability to new applications without requiring any knowledge of Natural Language Processing. The market potential is very large in principle, provided that a suitable, easy-to-use and effective methodology is available. In this paper we describe LearningPinocchio, a system for adaptive Information Extraction from texts that is enjoying good commercial and scientific success. Real-world applications have been built, and evaluation licenses have been released to external companies for application development. We outline the basic algorithm behind the system and present a number of applications developed with LearningPinocchio. We then report on an evaluation performed by an independent company. Finally, we discuss the general suitability of this IE technology for real-world applications and draw some conclusions.
Similar resources
Learning to Tag for Information Extraction from Text
LearningPINOCCHIO is an algorithm for adaptive information extraction. It learns template filling rules that insert SGML tags into texts. LearningPINOCCHIO is based on a covering algorithm that learns rules by bottom-up generalization of instances in a tagged corpus. It has been tested on three scenarios in informal domains in two languages (Italian and English). Experiments report excellent re...
Fingerprint Core and Delta Detection by Candidate Analysis
In many real-world applications such as face recognition and mobile robotics, we need to use an adaptive version of feature extraction techniques. In this paper, we introduce an adaptive face recognition system based on the PCA algorithm. We combine Sanger’s adaptive algorithm for computing the effective eigenvectors with the QR decomposition algorithm, which is used to estimate the associated eigenvalues. By...
A review of agent-based modeling (ABM) concepts and some of its main applications in management science
We live in a very complex world where we face complex phenomena such as social norms and new technologies. To deal with such phenomena, social scientists often use a reductionist approach, in which they reduce the phenomena to some lower-level variables and model the relationships among them through a system of equations. This approach, called equation-based modeling (EBM), has some basic weaknesses in ...
Adaptive Information Extraction from Text by Rule Induction and Generalisation
(LP)² is a covering algorithm for adaptive Information Extraction from text (IE). It induces symbolic rules that insert SGML tags into texts by learning from examples found in a user-defined tagged corpus. Training is performed in two steps: initially a set of tagging rules is learned; then additional rules are induced to correct mistakes and imprecision in tagging. Induction is performed by b...
Target Tracking with Unknown Maneuvers Using Adaptive Parameter Estimation in Wireless Sensor Networks
Tracking a target that is sensed by a collection of randomly deployed, limited-capacity, short-range sensors is a tricky problem, yet one applicable to the empirical world. In this paper, this challenge is addressed by introducing a nested algorithm to track a maneuvering target entering the sensor field. In the proposed nested algorithm, different modules are to fulfill...
Journal: Natural Language Engineering
Volume: 10
Publication date: 2004